Temporal Envelopes in Sine-Wave Speech Recognition
Abstract
There is a long-standing debate in speech perception theory over the relative importance of spectral and temporal cues. On the one hand, highly intelligible sine-wave speech (SWS) has been viewed as a representation of the global spectral structure of the speech signal. On the other hand, accumulating evidence shows that the temporal aspects of speech, without spectral details, are sufficient for speech understanding. The present study explored whether the temporal envelopes embedded in SWS contribute to its intelligibility. In the experiments, both SWS and natural speech signals were processed with noise and tone vocoders to remove the spectral details while preserving the temporal envelopes. Twenty-two normal-hearing, native English-speaking adult listeners participated in sentence recognition tasks. Recognition of vocoder-processed SWS was slightly poorer than that of vocoder-processed natural speech, but both reached plateau performance at 6-8 channels. Acoustic analysis further indicated that the temporal envelopes of the SWS were almost identical to those of the natural speech, with a mean correlation coefficient of r = 0.949 across all sentences. The results provide strong evidence that SWS represents both the spectral and temporal structures of speech and that the temporal envelopes embedded in SWS carry important information for speech recognition.
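The envelope comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the envelopes were extracted or which analysis bands were used, so everything below is an assumption: the envelope is taken by full-wave rectification followed by a 20 ms moving average, the two inputs are synthetic amplitude-modulated tones rather than speech, and the names `envelope`, `pearson_r`, and `am_tone` are hypothetical helpers, not the study's procedure.

```python
import math


def envelope(signal, window):
    """Full-wave rectify, then smooth with a trailing moving average.

    ASSUMPTION: this stands in for whatever envelope extractor the study used.
    """
    rectified = [abs(x) for x in signal]
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(rectified):
        running_sum += value
        if i >= window:
            running_sum -= rectified[i - window]
        # Divide by the number of samples actually inside the window.
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def am_tone(carrier_hz, mod_hz, fs, duration_s):
    """Sinusoidal carrier with a slow sinusoidal amplitude modulation."""
    n = int(fs * duration_s)
    return [
        (1.0 + 0.8 * math.sin(2 * math.pi * mod_hz * t / fs))
        * math.sin(2 * math.pi * carrier_hz * t / fs)
        for t in range(n)
    ]


fs = 8000                             # sampling rate in Hz (illustrative)
window = fs // 50                     # 20 ms smoothing window
tone_a = am_tone(500, 4, fs, 1.0)     # different carriers ...
tone_b = am_tone(1500, 4, fs, 1.0)    # ... same 4 Hz modulation
r = pearson_r(envelope(tone_a, window), envelope(tone_b, window))
print(f"envelope correlation r = {r:.3f}")
```

Two signals with different fine structure but the same slow amplitude modulation yield a high envelope correlation here, which is the kind of relationship the reported r value summarizes for SWS and natural speech. A real analysis would typically band-pass filter the signals into channels first and compare envelopes per channel.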
Similar resources
Learning to perceptually organize speech signals in native fashion.
The ability to recognize speech involves sensory, perceptual, and cognitive processes. For much of the history of speech perception research, investigators have focused on the first and third of these, asking how much and what kinds of sensory information are used by normal and impaired listeners, as well as how effective amounts of that information are altered by "top-down" cognitive processes...
Factors Affecting the Intelligibility of Sine-Wave Speech
Studies on sine-wave speech (SWS) perception suggest that formants contain sufficient information for sentence intelligibility. This study further investigated the effects of amplitude modulation, number of sine-waves, and vowel resonance in SWS recognition. Results showed that Mandarin sentences synthesized using frequency trajectories of the first two formants were highly intelligible with ad...
Sine-wave speech recognition in a tonal language.
It is hypothesized that in sine-wave replicas of natural speech, lexical tone recognition would be severely impaired due to the loss of F0 information, but the linguistic information at the sentence level could be retrieved even with limited tone information. Forty-one native Mandarin-Chinese-speaking listeners participated in the experiments. Results showed that sine-wave tone-recognition perf...
The Relationship between Speech Perception and Auditory Organisation: Studies with Spectrally Reduced Speech
Listeners are remarkably adept at recognising speech that has undergone extensive spectral reduction. Natural speech can be reproduced using as few as three time-varying sinusoids mimicking the corresponding speech formants. Untrained listeners are able to transcribe this 'sine-wave' speech with a high degree of reliability. Phonetic percepts generated by sine-wave speech occur despite an appar...
Modelling the recognition of spectrally reduced speech
Jon Barker and Martin Cooke, Department of Computer Science, University of Sheffield, Sheffield, UK. Progress in robust automatic speech recognition may benefit from a fuller account of the mechanisms and representations used by listeners in processing distorted speech. This paper reports on a number of studies which consider how recognisers trained on clean ...
Publication date: 2016